Preoperative Cervical Lymph Node Metastasis Prediction in Papillary Thyroid Carcinoma: A Noninvasive Clinical Multimodal Radiomics (CMR) Nomogram Analysis

This study aimed to evaluate the feasibility of applying a clinical multimodal radiomics nomogram based on ultrasonography (US) and multiparametric magnetic resonance imaging (MRI) for the preoperative prediction of cervical lymph node metastasis (LNM) in papillary thyroid carcinoma (PTC). We retrospectively evaluated 133 patients with pathologically confirmed PTC, who were assigned to a training cohort and a validation cohort (7 : 3), and extracted radiomics features from the preoperative US, T2-weighted (T2WI), diffusion-weighted (DWI), and contrast-enhanced T1-weighted (CE-T1WI) images. Optimal feature subsets were selected using minimum redundancy maximum relevance (mRMR) and recursive feature elimination with a support vector machine (SVM). For LNM prediction, the radiomics models were constructed by SVM, and Multi-Omics Graph cOnvolutional NETworks (MOGONET) was used for the classification of multiradiomics data. Multivariable logistic regression incorporating the multiradiomics signature and clinical risk factors was used to generate a nomogram, whose performance and clinical utility were assessed. The nine most predictive features were selected separately from each of the US, T2WI, DWI, and CE-T1WI images, and 18 features were selected for the combined model. The combined radiomics model showed better performance than the models based on US, T2WI, DWI, or CE-T1WI alone. In receiver operating characteristic curve analysis, the area under the curve (AUC) (95% CI) of the MOGONET model was 0.84 (0.76–0.93) and 0.84 (0.71–0.96) in the training and validation cohorts, respectively. The corresponding values for the combined radiomics model were 0.82 (0.74–0.90) and 0.77 (0.61–0.94). The MOGONET model thus showed better performance and better specificity than the combined radiomics model.
The nomogram including the MOGONET signature showed better predictive value than the clinical model in both the training cohort (AUC: 0.88 vs. 0.81) and the validation cohort (AUC: 0.87 vs. 0.74). Calibration curves showed good agreement in both cohorts. The applicability of the clinical multimodal radiomics (CMR) nomogram in clinical settings was supported by decision curve analysis. In patients with PTC, the CMR nomogram could improve the preoperative prediction of cervical LNM and may be helpful in clinical decision-making.


Introduction
In the past three decades, the incidence of papillary thyroid cancer (PTC) has continued to increase worldwide [1,2]. PTC is the most common histological type of thyroid cancer (89.1%), and its incidence-based mortality rates continue to increase [3]. However, the mortality rates of PTC (0.3 per 100,000 in men and 0.5 per 100,000 in women) remain very low [1]. Although PTC is an indolent tumor with a good prognosis, 30%-80% of cases may show cervical lymph node metastasis (LNM), which is common (incidence rate, up to 41.3%) even in papillary thyroid microcarcinoma [4]. Cervical LNM first develops in the central neck region, corresponding to cervical level VI, and then in the lateral neck. Cervical LNM in PTC is an important factor in determining the surgical approach: total thyroidectomy or lobectomy, and bilateral or ipsilateral central node dissection (CND). It is also an independent factor influencing the risk of poor prognosis and local recurrence of PTC [5-7] and the most important factor predicting a high risk of lateral LNM [4]. Many PTC patients undergo procedures such as total thyroidectomy and CND to address the risk of cervical LNM, frequently resulting in overtreatment [8].
Although CND improves disease-specific survival and reduces local recurrence in cases of LNM [9], prophylactic CND has been reported not to improve long-term outcomes and is associated with high rates of hypoparathyroidism [10]. Given the increasing awareness of the substantial impact of PTC overdiagnosis, the guidelines of the American Thyroid Association (ATA) recommend thyroid lobectomy alone and active surveillance as initial treatments for low-risk PTC patients [11]. Preoperative examinations should therefore be improved to identify patients with high-risk PTC more accurately and to provide individualized treatment.
Preoperative ultrasound (US) is useful for assessing lateral cervical LNM among patients with PTC [12]. However, its sensitivity for evaluating central cervical LNM is only 30%-50% [12,13]. Contrast-enhanced US and elastosonography have been shown to be superior to conventional US [13,14]. However, US evaluation is operator dependent and may not provide adequate visualization of deep anatomical structures or of structures that are acoustically obscured by bone or air. Computed tomography (CT) shows greater sensitivity than US for detecting central cervical LNM but lower sensitivity for predicting lateral cervical LNM. Thus, a noninvasive and effective approach to predicting cervical LNM risk in PTC is essential for guiding diagnosis and treatment.
Risk analysis for the prediction of cervical LNM among patients with PTC has been proposed in several studies, and tumor size, location, extrathyroidal extension, and microcalcification were found to be independent risk factors for cervical LNM [4,6,7,15]. Several cervical LNM prediction models were constructed by combining these risk factors with US features [16-18]. Recently, radiomics has received great attention for its potential to facilitate accurate diagnosis. CT- and US-based radiomics approaches for predicting cervical LNM among patients with PTC have been reported in several studies [19-21]. A previous study confirmed that MRI-based radiomics showed good performance for preoperative cervical LNM prediction among patients with PTC, with an area under the receiver operating characteristic (ROC) curve (AUC) of 0.835 and 0.830 in the training and validation groups, respectively [22]. However, all of the aforementioned studies were conducted with a single imaging modality and used different methods for feature extraction and model construction. Combining features from multiple imaging modalities may further improve the performance of radiomics models in preoperative cervical LNM prediction; however, few reports have compared radiomics models based on individual imaging modalities with those based on multiple imaging modalities for predicting cervical LNM in PTC.
The widespread use of high-throughput technologies has led to the emergence of multiomics integrative analysis approaches. Researchers can obtain omics data at scale from different molecular levels, such as the genome, transcriptome, proteome, interactome, epigenome, metabolome, lipidome, and microbiome, to advance the understanding of biological processes and molecular mechanisms.
Multi-Omics Graph cOnvolutional NETworks (MOGONET), a novel multiomics integrative method, was recently proposed by Wang et al. [23]. As a supervised algorithm based on graph networks, MOGONET has been reported to outperform other multiomics integration methods and to be effective for multiomics data classification. In this study, we applied this method to a multiradiomics model based on US and multiparametric MRI data of thyroid neoplasms to predict cervical LNM among patients with PTC. We compared the performance of the multiradiomics models constructed using MOGONET and a support vector machine (SVM) and then constructed a predictive nomogram. We hypothesized that, given its superior performance in previous biomedical classification tasks, the MOGONET multiradiomics model would predict cervical LNM better than traditional radiomics and clinical statistical models.

Patients.
The ethics committee of Minhang Hospital, Fudan University School of Medicine, approved this study. All participants provided written informed consent before US and MRI examinations. This retrospective review was conducted using the data of 268 consecutive patients who presented with pathologically confirmed PTC at our hospital from January 1, 2017 to December 31, 2021. The inclusion criteria were as follows: (1) pathologically confirmed PTC; (2) receipt of neck lymph node dissection and preoperative MRI and US examinations; (3) no previous biopsy or surgery of the thyroid; and (4) no history of neck cancer or radiation therapy. The exclusion criteria were as follows: (1) maximum tumor diameter of <5 mm; (2) no lymph node dissection; (3) poor MRI quality; (4) measuring lines on US images; and (5) inconsistency between MR and US images. Finally, 133 patients were included: 58 (43.6%) did not have cervical LNM (non-LNM group) and 75 (56.4%) had pathologically confirmed cervical LNM (LNM group). The patient selection flow chart is shown in Supplementary Figure 1.

Feature Extraction and Selection.
All images were normalized before the feature extraction procedure. The following features were extracted using the PyRadiomics package (3.0.1) [24] implemented in Python: gray-level co-occurrence matrix (GLCM), gray-level size zone matrix (GLSZM), gray-level run-length matrix (GLRLM), first-order statistics, Laplacian of Gaussian (LoG), and wavelet features. Data were randomly divided into training and validation groups at a ratio of 7 : 3. Two feature selection methods were applied in sequence. First, minimum redundancy maximum relevance (mRMR) was used to eliminate redundant features and retain features showing a high correlation with the labels; twenty features were retained. Subsequently, the recursive feature elimination (RFE) algorithm was used to find a subset of predictors that could produce an accurate model through backward selection based on predictor importance ranking: the predictors were ranked, and the less important ones were sequentially eliminated before modelling.
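The first selection step can be illustrated with a minimal greedy mRMR sketch. It is a simplified stand-in under stated assumptions, not the mRMRe implementation used in the study: relevance and redundancy are both approximated by absolute Pearson correlation, the function names `pearson` and `mrmr_select` are hypothetical, and the toy feature values are invented for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def mrmr_select(features, labels, k):
    """Greedy mRMR: repeatedly pick the feature with the best
    (relevance to the label) minus (mean redundancy with already selected features).

    features: dict mapping feature name -> list of values, one per patient
    labels:   list of 0/1 labels (1 = LNM)
    k:        number of features to keep (the study retained 20)
    """
    relevance = {name: abs(pearson(vals, labels)) for name, vals in features.items()}
    selected = []
    remaining = set(features)
    while remaining and len(selected) < k:
        def score(name):
            if not selected:
                return relevance[name]
            redundancy = sum(
                abs(pearson(features[name], features[s])) for s in selected
            ) / len(selected)
            return relevance[name] - redundancy

        # sorted() makes tie-breaking deterministic
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Invented toy data: "size_dup" is nearly a copy of "size", so it is redundant.
toy_labels = [0, 0, 0, 1, 1, 1]
toy_features = {
    "size": [1, 2, 3, 7, 8, 9],
    "size_dup": [1, 2, 3, 7, 8, 10],
    "texture": [3, 2, 1, 5, 4, 3],
}
print(mrmr_select(toy_features, toy_labels, 2))
```

In this toy case the near-duplicate feature is skipped in favour of a less relevant but less redundant one, which is the behaviour mRMR is designed to produce. A subsequent RFE step would then rank the retained features by classifier importance and drop the weakest ones iteratively.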

Radiomics Model Construction and Nomogram Development.
First, radiomics models based on US (US-radiomics), T2WI (T2WI-radiomics), DWI (DWI-radiomics), and CE-T1WI (CE-T1WI-radiomics) were constructed using SVM. Then, a multiparametric radiomics model was established by integrating these four image modalities using SVM and MOGONET. In contrast to traditional machine learning classification methods, MOGONET exploits the strengths of each imaging modality and accounts for the correlations among samples, which are encoded in sample similarity networks and learned by imaging modality-specific graph convolutional networks (GCNs). Next, the outputs of the modality-specific GCNs were used to build a cross-image modality discovery tensor to explore the cross-image modality correlations in the label space. Then, the View Correlation Discovery Network (VCDN) was used for effective multi-image modality integration, producing the final prediction from the cross-image modality discovery tensor. A nomogram for cervical LNM prediction was developed from the clinical risk factors and the MOGONET prediction using stepwise multivariable logistic regression analysis.
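As a rough illustration of the integration step, the sketch below builds a cross-image modality discovery tensor for one patient as the outer product of the per-modality class-probability vectors, following the construction described for MOGONET [23]. The probabilities are invented inputs, the function name `cross_modality_tensor` is hypothetical, and the GCNs and the trained VCDN are omitted.

```python
from itertools import product


def cross_modality_tensor(prob_vectors):
    """Flattened outer product of per-modality class-probability vectors.

    prob_vectors: list of m lists, each holding c class probabilities
                  predicted by one modality-specific classifier.
    Returns a list of c**m entries; the entry for index tuple (i1, ..., im)
    is p1[i1] * ... * pm[im], in row-major order.
    """
    tensor = []
    for combo in product(*prob_vectors):
        value = 1.0
        for p in combo:
            value *= p
        tensor.append(value)
    return tensor


# Invented [P(non-LNM), P(LNM)] outputs for US, T2WI, DWI, and CE-T1WI.
patient = [[0.3, 0.7], [0.4, 0.6], [0.2, 0.8], [0.5, 0.5]]
flat = cross_modality_tensor(patient)
print(len(flat), flat[-1])
```

With four modalities and two classes, the flattened tensor has 2^4 = 16 entries. In MOGONET, this vector would be the input to VCDN, which learns how agreement and disagreement among modalities map to the final label.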

Statistical Analysis.
Continuous variables are presented as mean ± standard deviation values, and categorical variables are shown as counts (percentages). The chi-square or Fisher's exact test was used to compare categorical variables. The t-test or Mann-Whitney test was used to compare continuous variables, depending on the data distribution. The performance of the LNM prediction models was evaluated by ROC analysis, and the sensitivity (SEN), specificity (SPE), positive predictive value (PPV), negative predictive value (NPV), accuracy (ACC), and AUC were recorded. The nomogram performance was also evaluated using ROC analysis. DeLong's test was applied to compare ROC curves, and net reclassification improvement (NRI) and integrated discrimination improvement (IDI) were calculated. The Hosmer-Lemeshow test was used to assess the nomogram's goodness of fit. Finally, decision curve analysis (DCA) was performed to evaluate the nomogram's clinical utility. IBM SPSS Statistics 26.0 (IBM Corp, Armonk, NY, USA), R software (version 4.1.3; https://www.r-project.org/), and Python (version 3.5.6; https://www.python.org/) were used for all statistical analyses. The "mRMR" algorithm in the "mRMRe" package was used to apply the minimum redundancy maximum relevance algorithm for the initial screening of the radiomics features. The best feature subset was selected by the "glmnet" algorithm in the "glmnet" package. ROC analysis was conducted with the "pROC" package to evaluate model performance. The "calPlot2" function in the "ModelGood" package was used to plot the calibration curves, and decision curves were plotted using the "rmda" package. The MOGONET algorithm we used has been described previously [23]. A two-tailed p value of <0.05 was considered statistically significant.
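The diagnostic metrics listed above follow directly from the confusion matrix and from pairwise score comparisons. The following minimal sketch uses invented labels and scores; the function names are hypothetical, and the AUC is computed with the rank-based (Mann-Whitney) formulation rather than with the pROC package used in the study.

```python
import math


def _ratio(num, den):
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else math.nan


def confusion_metrics(y_true, y_pred):
    """SEN, SPE, PPV, NPV, and ACC from binary labels and binary predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "SEN": _ratio(tp, tp + fn),
        "SPE": _ratio(tn, tn + fp),
        "PPV": _ratio(tp, tp + fp),
        "NPV": _ratio(tn, tn + fn),
        "ACC": _ratio(tp + tn, len(pairs)),
    }


def auc_rank(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative
    (ties count as 0.5). Assumes both classes are present."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented example: 1 = LNM, 0 = non-LNM; predictions thresholded at 0.5.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]
metrics = confusion_metrics(y_true, y_pred)
auc = auc_rank(y_true, scores)
print(metrics, auc)
```

Unlike SEN, SPE, PPV, NPV, and ACC, the AUC does not depend on the chosen probability threshold, which is why it is reported alongside the threshold-based metrics.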

Patient Characteristics.
The study population included 133 patients with PTC (40 males, 93 females; age 44.69 ± 13.50 years; age range, 13-77 years). The incidence rate of cervical LNM was 56.39% (75/133). The patients' detailed clinical characteristics are summarized in Table 1. None of the clinical characteristics differed significantly between the training and validation cohorts. The associations between clinical characteristics and LNM in the training and validation cohorts are presented in Table 2. The non-LNM and LNM groups in both cohorts showed no significant differences in age, sex, or the number of lesions. Tumor diameter was larger in the LNM group in both cohorts. The LNM group showed a greater frequency of bilateral and multifocal PTCs (training cohort: p < 0.05; validation cohort: p > 0.05). In both cohorts, the LNM group had a markedly higher frequency of thyroid contour protrusion and of poorly defined tumor margins (p < 0.001). Similarly, in both cohorts, microcalcification was more common in the LNM group (p < 0.05). In the training cohort, the incidence of an aspect ratio of >1 differed significantly between the two groups (p = 0.004).

Performance of the Radiomics Models.
After the interobserver ICC analysis, high-throughput features were extracted: 740 from US images, 1045 from T2WI images, 1045 from DWI images, and 785 from CE-T1WI images. Eventually, the nine most predictive features were selected separately from each of the US, DWI, T2WI, and CE-T1WI images, and 18 features were selected from the combined images. Feature importance was evaluated; Supplementary Figure 2 presents the selected features and their importance. Figure 1 shows the ROC curves of the radiomics models for distinguishing the LNM group from the non-LNM group in both cohorts. The combined radiomics model showed better performance than the other four radiomics models in both cohorts. The AUC, ACC, SEN, SPE, PPV, and NPV of the six models are detailed in Table 3.
Multivariable logistic regression identified poorly defined margins, thyroid contour protrusion, and the MOGONET score as independent risk factors for LNM (Table 4). Therefore, we constructed a nomogram for LNM prediction using these predictors (Figure 2(a)). The ROC analysis showed that the AUC (95% CI) of the nomogram was 0.88 (0.81-0.95) and 0.87 (0.75-0.99) in the training and validation cohorts, respectively (Figure 2(b)), values that were higher than those of the clinical model. This finding suggests that the nomogram had good discriminative ability. The Hosmer-Lemeshow test showed good agreement between the fitted and observed values in both cohorts (all p > 0.05) (Figure 3). DeLong's test showed that the AUCs of the nomogram and the clinical model were not significantly different in either cohort (p > 0.05). However, NRI and IDI showed significant differences between the two models. NRI (95% CI) and IDI (95% CI) were 0.96 (0.60-1.31; p ≤ 0.01) and 0.14 (0.07-0.21; p ≤ 0.01) in the training cohort and 0.71 (0.14-1.29; p = 0.015) and 0.17 (0.0471-0.2891; p = 0.006) in the validation cohort.
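The two reclassification measures reported above can be sketched as follows. This is an illustration on invented probabilities, not the study's computation: the function names are hypothetical, and because the NRI variant used is not stated, the category-free (continuous) form is assumed here.

```python
def idi(y, p_new, p_old):
    """Integrated discrimination improvement: the change in discrimination slope,
    where the slope is mean predicted risk in events minus that in non-events."""
    def slope(p):
        events = [pi for yi, pi in zip(y, p) if yi == 1]
        nonevents = [pi for yi, pi in zip(y, p) if yi == 0]
        return sum(events) / len(events) - sum(nonevents) / len(nonevents)

    return slope(p_new) - slope(p_old)


def continuous_nri(y, p_new, p_old):
    """Category-free NRI: net proportion of events moved up plus
    net proportion of non-events moved down by the new model."""
    up_e = down_e = n_e = 0
    up_n = down_n = n_n = 0
    for yi, new, old in zip(y, p_new, p_old):
        if yi == 1:
            n_e += 1
            up_e += new > old
            down_e += new < old
        else:
            n_n += 1
            up_n += new > old
            down_n += new < old
    return (up_e - down_e) / n_e + (down_n - up_n) / n_n


# Invented example: "old" stands for a clinical model, "new" for a nomogram.
y = [1, 1, 0, 0]
p_old = [0.6, 0.5, 0.5, 0.4]
p_new = [0.8, 0.7, 0.3, 0.5]
print(idi(y, p_new, p_old), continuous_nri(y, p_new, p_old))
```

Both measures can be significant even when DeLong's test is not, because they respond to shifts in individual predicted risks rather than only to changes in rank ordering.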

Performance of the Predictive Nomogram.
DCA of the nomogram and the clinical model was performed to determine whether the nomogram can improve the net benefit for patients. The DCA results indicated that the nomogram had a greater net benefit than the clinical model when the threshold probability was between 0 and 0.7 (Figure 4).
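The net benefit plotted in a decision curve can be computed with the standard formula NB = TP/n - (FP/n) × pt/(1 - pt), where pt is the threshold probability. The sketch below uses invented outcomes and predicted probabilities; the function names are hypothetical, and the study itself used the "rmda" package in R.

```python
def net_benefit(y_true, probs, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold.
    The threshold must lie strictly between 0 and 1."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, probs) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the reference strategy that treats every patient."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Invented example: 1 = LNM, 0 = non-LNM.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
probs = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]
print(net_benefit(y_true, probs, 0.5), net_benefit_treat_all(y_true, 0.5))
```

A decision curve repeats this calculation over a range of thresholds and compares each model with the treat-all and treat-none (net benefit of zero) strategies.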

Discussion
In patients with PTC, cervical LNM indicates a risk of local recurrence and poor prognosis. We aimed to develop a useful tool based on multiradiomics data for predicting cervical LNM preoperatively. To this end, we constructed four radiomics models using T2WI, DWI, CE-T1WI, and US images, and one linear combination model based on multimodal images using a traditional classification algorithm (SVM). We also used MOGONET to classify the multiradiomics data and compared its performance with that of the SVM-based models. MOGONET showed better predictive performance (AUC >0.8 in both cohorts) than the other models, suggesting that a multiradiomics model could be valuable for predicting cervical LNM in PTC. The DCA also supported the potential applicability of the nomogram incorporating the MOGONET model and clinical risk factors. This approach could be helpful for early medical management and could help avoid overdiagnosis and overtreatment in PTC.
Although the prognosis of PTC is much better than that of many other cancers, cervical LNM occurs in 30%-80% of patients with PTC. Cervical LNM is an important consideration in surgical planning and clinical management for patients with PTC. Nodal metastases most commonly occur at cervical level VI. The ATA guidelines recommend therapeutic lymph node dissection for cN1 disease in PTC. However, the role of prophylactic central compartment neck dissection (CCND) for cN0 disease is highly controversial. Some studies have suggested that prophylactic CCND could reduce local recurrence while improving the accuracy of recurrence risk assessment. In contrast, other studies have demonstrated that prophylactic CCND offers no clear benefit for long-term outcomes and is associated with a higher potential for complications, including hemorrhage and injuries to the posterior recurrent nerve. These findings highlight the need for a method that predicts LNM preoperatively with improved accuracy and identifies high-risk patients more reliably.
Although US is the preferred imaging modality for assessing thyroid lesions and cervical lymph nodes, it cannot adequately reveal the central region and shows limited ability to identify central cervical LNM. Moreover, operator-related differences substantially influence the accuracy of US-based diagnoses of cervical LNM.
Several recent studies on clinical prediction models have shown that complex echo patterns, posterior region homogeneity, microcalcifications, extrathyroidal extension, capsule contact, age ≤45 years, and tumor size >1.0 cm were independent indicators of cervical LNM among patients with PTC. However, these predictors are subjective and showed variable sensitivity and specificity in predicting cervical LNM among patients with PTC.
In comparison with conventional image analysis, radiomics features provide objective information about the lesion. Radiomics signatures have been shown to be useful for predicting LNM and prognosis in cancer studies. Moreover, radiomics has been used to predict LNM among patients with PTC. Liu et al. established a US-based radiomics model to predict cervical LNM among patients with PTC and reported AUCs of 0.78 in the training cohort and 0.73 in the validation cohort. A nomogram based on shear-wave elastography (SWE) radiomics also showed good calibration and discrimination ability (AUC = 0.83 in the test cohort), demonstrating that the SWE radiomics signature is a useful biomarker for cervical LNM prediction among patients with PTC [25]. Yu et al. [20] proposed a transfer learning radiomics model for LNM prediction in PTC; the model achieved an AUC of 0.93 and yielded more benefits than other methods. In comparison with qualitative CT image features, the radiomics signature of dual-energy CT iodine maps performed better in the preoperative diagnosis of cervical LNM in PTC [26]. In our previous study, we also demonstrated that radiomics based on multiparametric MRI could adequately predict cervical LNM in PTC patients (AUC = 0.83 in the test cohort) [22].
The radiomics studies described above were conducted on the basis of single-modality medical images. As far as we know, no previous study has compared the performance of radiomics models based on different or multiple imaging modalities. Therefore, we constructed radiomics models based on US and MR images, including DWI, T2WI, and CE-T1WI sequences, and compared their predictive performance. In the validation cohort, the AUCs of the DWI-radiomics, T2WI-radiomics, CE-T1WI-radiomics, and US-radiomics models were 0.74, 0.52, 0.68, and 0.66, respectively.
The existing classification methods based on supervised data integration include strategies based on feature concatenation and on ensembles. In concatenation-based methods, different types of omics data are integrated by directly concatenating the input data features to train the classification model. In contrast, predictions from different [...] classification results from different types of omics data yielded consistent improvements in classification performance.
The logistic regression analyses suggested that in PTC patients, poorly defined margins, thyroid contour protrusion, and MOGONET scores were independent risk factors for LNM. The AUC of the nomogram was higher than that of the clinical model. Although DeLong's test showed no significant difference between the ROC curves of the clinical model and the nomogram (p > 0.05), the small sample size may have influenced this finding. In contrast, the NRI and IDI showed that the nomogram significantly improved on the clinical model in both cohorts (p < 0.05), indicating improved predicted probabilities. DCA showed a net benefit of the nomogram.
The study had several limitations. First, the data were obtained from a single centre and lacked external validation. Second, the sample size was small, and the prognostic value of the findings should be further validated in the future. Third, since this was a retrospective study, LN status was evaluated on the basis of postoperative pathology. The largest long-axis cross-section image on US was operator dependent, and we cannot ensure complete consistency between the largest long-axis cross-section images obtained using US and MRI. Thus, selection bias was inevitable and may have affected the results. Finally, US examinations were not performed on the same machine or by the same radiologist. The resultant inconsistencies in examination parameters may have affected the accuracy of the results, which need to be verified with large-sample multicentre data.

Conclusions
In conclusion, the MOGONET model integrating multiradiomics data performed better in LNM prediction than radiomics models constructed from single imaging modality data. This noninvasive clinical multimodal radiomics nomogram may facilitate clinical decision-making for patients with PTC.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.